Method and Apparatus for Accomplishing Multiple Description Coding for Video

ABSTRACT

A method and apparatus for utilizing temporal prediction and motion compensated prediction to accomplish multiple description video coding is disclosed. An encoder receives a sequence of video frames and divides each frame into non-overlapping macroblocks. Each macroblock is then encoded using either an intraframe mode (I-mode) or a prediction mode (P-mode) technique. Both the I-mode and the P-mode encoding techniques produce an output for each of n channels used to transmit the encoded video data to a decoder.

CROSS-REFERENCE TO RELATED PATENT APPLICATIONS

This is a continuation of U.S. patent application Ser. No. 11/050,570, filed Feb. 3, 2005, which is a continuation of U.S. patent application Ser. No. 10/350,537, filed Jan. 23, 2003, now U.S. Pat. No. 6,920,177, issued Jul. 19, 2005, which is a divisional application of application Ser. No. 09/478,002, filed Jan. 5, 2000, now U.S. Pat. No. 6,556,624, issued Apr. 29, 2003, which claims the benefit of U.S. Provisional Application Ser. No. 60/145,852 entitled Method and Apparatus for Accomplishing Multiple Description Coding for Video, filed Jul. 27, 1999.

FIELD OF THE INVENTION

The present invention relates to video coding. More particularly, the present invention relates to a method for utilizing temporal prediction and motion compensated prediction to accomplish multiple description video coding.

BACKGROUND

Most of today's video coder standards use block-based motion compensated prediction because of its success in achieving a good balance between coding efficiency and implementation complexity.

Multiple Description Coding (“MDC”) is a source coding method that increases the reliability of a communication system by decomposing a source into multiple bitstreams and then transmitting the bitstreams over separate, independent channels. An MDC system is designed so that, if all channels are received, a very good reconstruction can be made. However, if some channels are not received, a reasonably good reconstruction can still be obtained. In commonly assigned U.S. patent application Ser. No. 08/179,416, a generic method for MDC using a pairwise correlating transform, referred to as “MDTC,” is described. This generic method is designed by assuming the inputs are a set of Gaussian random variables. A method for applying this method for image coding is also described. A subsequent and similarly commonly assigned U.S. Provisional Application Ser. No. 60/145,937 describes a generalized MDTC method. Papers describing MDC-related work include: Y. Wang et al., “Multiple Description Image Coding for Noisy Channels by Pairing Transform Coefficients,” in Proc. IEEE 1997 First Workshop on Multimedia Signal Processing, (Princeton, N.J.), June, 1997; M. T. Orchard et al., “Redundancy Rate Distortion Analysis of Multiple Description Image Coding Using Pairwise Correlating Transforms,” in Proc. ICIP97, (Santa Barbara, Calif.), October, 1997; Y. Wang et al., “Optimal Pairwise Correlating Transforms for Multiple Description Coding,” in Proc. ICIP98, (Chicago, Ill.), October 1998; and V. A. Vaishampayan, “Design of Multiple Description Scalar Quantizer,” in IEEE Trans. Inform. Theory, vol. 39, pp. 821-834, May 1993.

Unfortunately, in existing video coding systems, when not all of the bitstream data sent over the separate channels is received, the quality of the reconstructed video sequence suffers. Likewise, as the amount of the bitstream data that is not received increases, the quality of the reconstructed video sequence that can be obtained from the received bitstream decreases rapidly.

Accordingly, there is a need in the art for a new approach for coding a video sequence into two descriptions using temporal prediction and motion compensated prediction to improve the quality of the reconstructions that can be achieved when only one of the two descriptions is received.

SUMMARY OF THE INVENTION

Embodiments of the present invention provide a block-based motion-compensated predictive coding framework for realizing MDC, which includes two working modes: Intraframe Mode (I-mode) and Prediction Mode (P-mode). Coding in the P-mode involves the coding of the prediction errors and estimation/coding of motion. In addition, for both the I-mode and P-mode, the MDTC scheme has been adapted to code a block of Discrete Cosine Transform (“DCT”) coefficients.

Embodiments of the present invention provide a system and method for encoding a sequence of video frames. The system and method receive the sequence of video frames and then divide each video frame into a plurality of macroblocks. Each macroblock is then encoded using at least one of the I-mode technique and the P-mode technique, where, for n channels, the prediction mode technique generates at least n+1 prediction error signals for each block. The system and method then provide the I-mode technique encoded data and the at least n+1 P-mode technique prediction error signals divided between each of the n channels being used to transmit the encoded video frame data.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 provides a block diagram of the overall framework for Multiple Description Coding (“MDC”) of video using motion compensated prediction.

FIG. 2 provides a block diagram of the framework for MDC in P-mode.

FIG. 3A provides a block diagram of the general framework for the MDC Prediction Error (“MDCPE”) encoder of FIG. 2.

FIG. 3B provides a block diagram of the general framework for the MDCPE decoder of FIG. 2.

FIG. 4 provides a block diagram of an embodiment of the MDCPE encoder of FIG. 3A.

FIG. 5 provides a block diagram of another embodiment of the MDCPE encoder of FIG. 3A.

FIG. 6 provides a block diagram of another embodiment of the MDCPE encoder of FIG. 3A.

FIG. 7 provides a block diagram of an embodiment of multiple description motion estimation and coding (“MDMEC”) using spatial interleaving of the present invention.

FIG. 8 is a block diagram of an embodiment of an odd-even block encoding of a macroblock in the present invention.

FIG. 9 is a flow diagram representation of an embodiment of the encoder operations of the present invention.

FIG. 10 is a flow diagram representation of an embodiment of the decoder operations of the present invention when the decoder receives two coded descriptions of a video frame.

FIG. 11 is a flow diagram representation of another embodiment of the decoder operations of the present invention when the decoder only receives one coded description of a video frame.

DETAILED DESCRIPTION

The Overall Coding Framework

In accordance with an embodiment of the present invention, a multiple description (“MD”) video coder is developed using the conventional block-based motion compensated prediction. In this embodiment, each video frame is divided into non-overlapping macroblocks which are then coded in either the I-mode or the P-mode. In the I-mode, the color values of each of the macroblocks are directly transformed using a Discrete Cosine Transform (“DCT”) and the resultant quantized DCT coefficients are then entropy coded. In the P-mode, a motion vector, which describes the displacement between the spatial position of the current macroblock and the best matching macroblock, is first found and coded. Then the prediction error is coded using the DCT. Additional side information describing the coding mode and relevant coding parameters is also coded.
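By way of illustration only, the division of a frame into non-overlapping macroblocks may be sketched as follows. The function name `split_into_macroblocks`, the list-of-rows frame representation, and the default 16×16 tile size are assumptions made for this sketch and are not taken from the embodiments described herein.

```python
def split_into_macroblocks(frame, size=16):
    """Split a frame (list of pixel rows) into non-overlapping tiles.

    Returns a list of (top, left, block) tuples. Tiles on the bottom or
    right edge are smaller when the frame size is not a multiple of `size`,
    so tiles are only approximately uniform in size.
    """
    rows, cols = len(frame), len(frame[0])
    blocks = []
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            block = [row[left:left + size] for row in frame[top:top + size]]
            blocks.append((top, left, block))
    return blocks


# Hypothetical 32x48 frame whose pixel value encodes its position.
frame = [[r * 100 + c for c in range(48)] for r in range(32)]
macroblocks = split_into_macroblocks(frame)
```

Each returned tile would then be routed to either I-mode or P-mode coding.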

An embodiment of an overall MDC framework of the present invention is shown in FIG. 1 and is similar to the conventional video coding scheme using block-based motion compensated predictive coding. In FIG. 1, an input analog video signal is received in an analog-to-digital (“A/D”) converter (not shown) and each frame from the input analog video signal is digitized and divided into non-overlapping blocks of approximately uniform size as illustrated in FIG. 8. Although shown as such in FIG. 8, the use of non-overlapping macroblocks of approximately uniform size is not required by the present invention and alternative embodiments of the present invention are contemplated in which non-overlapping macroblocks of approximately uniform size are not used. For example, in one contemplated alternative embodiment, each digitized video frame is divided into overlapping macroblocks having non-uniform sizes. Returning to FIG. 1, each input macroblock X 100 is input to a mode selector 110 and then the mode selector selectively routes the input macroblock X 100 for coding in one of the two modes using switch 112 by selecting either channel 113 or channel 114. Connecting switch 112 to channel 113 enables I-mode coding in an I-mode MDC 120, and connecting switch 112 with channel 114 enables P-mode coding in a P-mode MDC 130. In the I-mode MDC 120, the color values of the macroblock are coded directly into two descriptions, description 1 122 and description 2 124, using the MDTC method; the generalized MDTC method described in co-pending U.S. patent application Ser. No. 08/179,416; Vaishampayan's Multiple Description Scalar Quantizer (“MDSQ”); or any other multiple description coding technique. In the P-mode MDC 130, the macroblock is first predicted from previously coded frames and two (2) descriptions are produced, description 1 132 and description 2 134. Although shown as being output on separate channels, embodiments of the present invention are contemplated in which the I-mode description 1 122 and the P-mode description 1 132 are output to a single channel. Similarly, embodiments are contemplated in which the I-mode description 2 124 and the P-mode description 2 134 are output to a single channel.

In FIG. 1, the mode selector 110 is connected to a redundancy allocation unit 140 and the redundancy allocation unit 140 communicates signals to the mode selector 110 to control the switching of switch 112 between channel 113 for the I-mode MDC 120 and channel 114 for the P-mode MDC 130. The redundancy allocation unit 140 is also connected to the I-mode MDC 120 and the P-mode MDC 130 to provide inputs to control the redundancy allocation between motion and prediction error. A rate control unit 150 is connected to the redundancy allocation unit 140, the mode selector 110, the I-mode MDC 120 and the P-mode MDC 130. A set of frame buffers 160 is also connected to the mode selector 110 for storing previously reconstructed frames from the P-mode MDC 130 and for providing macroblocks from the previously reconstructed frames back to the P-mode MDC 130 for use in encoding and decoding the subsequent macroblocks.

In an embodiment of the present invention, a block-based uni-directional motion estimation method is used, in which the prediction macroblock is determined from a previously decoded frame. Two types of information are coded: i) the error between the prediction macroblock and the actual macroblock, and ii) the motion vector, which describes the displacement between the spatial position of the current macroblock and the best matching macroblock. Both are coded into two descriptions. Because the decoder may have either both descriptions or one of the two descriptions, the encoder has to take this fact into account in coding the prediction error. The proposed framework for realizing MDC in the P-mode is described in more detail below.

Note that the use of I-mode coding enables the system to recover from an accumulated error due to the mismatch between the reference frames used in the encoder for prediction and those available at the decoder. The extra number of bits used for coding in the I-mode, compared to using the P-mode, is a form of redundancy that is intentionally introduced by the coder to improve the reconstruction quality when only a single description is available at the decoder. In conventional block-based video coders, such as an H.263 coder, described in ITU-T, “Recommendation H.263 Video Coding for Low Bitrate Communication,” July 1995, the choice between I-mode and P-mode is dependent on which mode uses fewer bits to produce the same image reconstruction quality. For error-resilience purposes, I-mode macroblocks are also inserted periodically, but very sparsely; for example, in accordance with an embodiment of the present invention, one I-mode macroblock is inserted after approximately ten to fifteen P-mode macroblocks. The rate at which the I-mode macroblocks are inserted is highly dependent on the video being encoded and, therefore, is variably controlled by the redundancy allocation unit 140 for each video input stream. In applications requiring a constant output rate, the rate control unit 150 regulates the total number of bits that can be used on a frame-by-frame basis. As a result, the rate control unit 150 influences the choice between the I-mode and the P-mode. In an embodiment of the present invention, the proposed switching between I-mode and P-mode depends not only on the target bit rate and coding efficiency but also on the desired redundancy. As a result of this redundancy dependence, the redundancy allocation unit 140, together with the rate control unit 150, determines i) on the global level, the redundancy allocation between I-mode and P-mode; and ii) for every macroblock, which mode to use.
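By way of illustration only, a per-macroblock mode decision combining the bit-count comparison with a periodic I-mode refresh may be sketched as follows. The function `choose_mode`, its parameters, and the default refresh interval of twelve macroblocks (one value within the ten-to-fifteen range mentioned above) are assumptions made for this sketch; the actual decision made by the redundancy allocation unit 140 and the rate control unit 150 also depends on the target bit rate and desired redundancy, which are not modeled here.

```python
def choose_mode(bits_i, bits_p, p_blocks_since_intra, refresh_interval=12):
    """Return "I" or "P" for one macroblock.

    bits_i / bits_p: estimated bits needed by each mode for the same
    reconstruction quality (assumed to be supplied by the caller).
    p_blocks_since_intra: number of P-mode macroblocks coded since the
    last I-mode macroblock.
    """
    # Forced refresh: limits error accumulation from reference mismatch.
    if p_blocks_since_intra >= refresh_interval:
        return "I"
    # Otherwise pick whichever mode is cheaper at equal quality.
    return "I" if bits_i < bits_p else "P"


def assign_modes(bit_estimates, refresh_interval=12):
    """Apply choose_mode over a sequence of (bits_i, bits_p) estimates."""
    modes, since_intra = [], 0
    for bits_i, bits_p in bit_estimates:
        mode = choose_mode(bits_i, bits_p, since_intra, refresh_interval)
        since_intra = 0 if mode == "I" else since_intra + 1
        modes.append(mode)
    return modes
```

A larger refresh interval corresponds to less intentionally introduced redundancy.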

P-mode Coding. In general, the MDC coder in the P-mode will generate two descriptions of the motion information and two descriptions of the prediction error. A general framework for implementing MDC in the P-mode is shown in FIG. 2. In FIG. 2, the encoder has three separate frame buffers (“FB”), FB0 270, FB1 280 and FB2 290, for storing previously reconstructed frames from both descriptions (ψ_(0,k-m)), description one (ψ_(1,k-m)), and description two (ψ_(2,k-m)), respectively. Here, k represents the current frame time, and k−m, m=1, 2, . . . , k, the previous frames up to frame 0. In this embodiment, prediction from more than one of the previously coded frames is permitted. In FIG. 2, a Multiple Description Motion Estimation and Coding (“MDMEC”) unit 210 receives as an initial input macroblock X 100 to be coded at frame k. The MDMEC 210 is connected to the three frame buffers FB0 270, FB1 280 and FB2 290 and the MDMEC 210 receives macroblocks from the previously reconstructed frames stored in each frame buffer. In addition, the MDMEC 210 is connected to a redundancy allocation unit 260 which provides an input motion and prediction error redundancy allocation to the MDMEC 210 to use to generate and output two coded descriptions of the motion information, {tilde over (m)}₁ and {tilde over (m)}₂. The MDMEC 210 is also connected to a first Motion Compensated Predictor 0 (“MCP0”) 240, a second Motion Compensated Predictor 1 (“MCP1”) 220 and a third Motion Compensated Predictor 2 (“MCP2”) 230. The two coded descriptions of the motion information, {tilde over (m)}₁ and {tilde over (m)}₂, are transmitted to the MCP0 240, which generates and outputs a predicted macroblock P₀ based on {tilde over (m)}₁, {tilde over (m)}₂ and macroblocks from the previously reconstructed frames from the descriptions ψ_(i,k-m), where i=0,1,2, which are provided by frame buffers FB0 270, FB1 280 and FB2 290.
Similarly, MCP1 220 generates and outputs a predicted macroblock P₁ based on {tilde over (m)}₁ from the MDMEC 210 and a macroblock from the previously reconstructed frame from description one (ψ_(1,k-m)) from FB1 280. Likewise, MCP2 230 generates and outputs a predicted macroblock P₂ based on {tilde over (m)}₂ from the MDMEC 210 and a macroblock from the previously reconstructed frame from description two (ψ_(2,k-m)) from FB2 290. In this general framework, MCP0 240 can make use of ψ_(1,k-m) and ψ_(2,k-m) in addition to ψ_(0,k-m). MCP0 240, MCP1 220 and MCP2 230 are each connected to a multiple description coding of prediction error (“MDCPE”) unit 250 and provide predicted macroblocks P₀, P₁ and P₂, respectively, to the MDCPE 250. The MDCPE 250 is also connected to the redundancy allocation unit 260 and receives as input the motion and prediction error redundancy allocation. In addition, the MDCPE 250 also receives the original input macroblock X 100. The MDCPE 250 generates two coded descriptions of prediction error, {tilde over (E)}₁ and {tilde over (E)}₂, based on input macroblock X 100, P₀, P₁, P₂ and the motion and prediction error redundancy allocation. Description one 132, in FIG. 1, of the coded video consists of {tilde over (m)}₁ and {tilde over (E)}₁ for all the macroblocks. Likewise, description two 134, in FIG. 1, consists of {tilde over (m)}₂ and {tilde over (E)}₂ for all the macroblocks. Exemplary embodiments of the MDMEC 210 and MDCPE 250 are described in the following sections.
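By way of illustration only, the three motion compensated predictors may be sketched as three block fetches, one per frame buffer. The helper names, the single-reference-frame simplification (m=1), and in particular the separate central motion vector `mv0` are assumptions made for this sketch; how MCP0 240 combines {tilde over (m)}₁ and {tilde over (m)}₂ is determined by the MDMEC embodiment and is not modeled here. The displaced block is assumed to lie entirely inside the reference frame.

```python
def motion_compensate(ref_frame, top, left, mv, size=8):
    """Fetch the size x size block displaced by mv = (dy, dx) from ref_frame."""
    dy, dx = mv
    r0, c0 = top + dy, left + dx
    return [row[c0:c0 + size] for row in ref_frame[r0:r0 + size]]


def p_mode_predictions(fb0, fb1, fb2, top, left, mv0, mv1, mv2, size=8):
    """Return (P0, P1, P2) for the block whose top-left corner is (top, left).

    fb0, fb1, fb2: previously reconstructed frames held in FB0, FB1, FB2.
    mv1, mv2: motion decoded from description one and two, respectively.
    mv0: hypothetical central motion derived from both descriptions.
    """
    p0 = motion_compensate(fb0, top, left, mv0, size)
    p1 = motion_compensate(fb1, top, left, mv1, size)
    p2 = motion_compensate(fb2, top, left, mv2, size)
    return p0, p1, p2
```

P₁ and P₂ each depend on only one description, so each remains computable at a decoder that receives only that description.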

Multiple Description Coding of Prediction Error (MDCPE)

The general framework of a MDCPE encoder implementation is shown in FIG. 3A. First, the prediction error in the case when both descriptions are available, F=X−P₀, is coded into two descriptions {tilde over (F)}₁ and {tilde over (F)}₂. In FIG. 3A, predicted macroblock P₀ is subtracted from input macroblock X 100 in an adder 306 and a both description side prediction error F₀ is input to an Error Multiple Description Coding (“EMDC”) Encoder 330. The encoding is accomplished in the EMDC Encoder 330 using, for example, MDTC or MDSQ. To deal with the case when only the i-th description is received (that is, where i=1 or 2), either an encoder unit one (“ENC1”) 320 or an encoder unit two (“ENC2”) 310 takes pre-run length coded coefficients, Δ{tilde over (C)}_(n) or Δ{tilde over (D)}_(n), respectively, and a description i side prediction error E_(i), where E_(i)=X−P_(i), and produces a description i enhancement stream {tilde over (G)}_(i). {tilde over (G)}_(i) together with {tilde over (F)}_(i) form a description i. Embodiments of the encoders ENC1 320 and ENC2 310 are described in reference to FIGS. 3A, 4, 5, 6 and 7. As shown in FIG. 3A, P₂ is subtracted from input macroblock X 100 by an adder 302 and a description two side prediction error E₂ is output. E₂ and Δ{tilde over (D)}_(n) are then input to ENC2 310 and a description two enhancement stream {tilde over (G)}₂ is output. Similarly, P₁ is subtracted from input macroblock X 100 in an adder 304 and a description one side prediction error E₁ is output. E₁ and Δ{tilde over (C)}_(n) are then input to ENC1 320 and a description one enhancement stream {tilde over (G)}₁ 322 is output.
In an alternate embodiment (not shown), Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n) are determined from {tilde over (F)}₁ and {tilde over (F)}₂ by branching both of the {tilde over (F)}₁ and {tilde over (F)}₂ output channels to connect with ENC1 320 and ENC2 310, respectively. Before the branches connect to ENC1 320 and ENC2 310, they each pass through separate run length decoder units to produce Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n), respectively. As will be seen in the description referring to FIG. 4, this alternate embodiment requires two additional run length decoders to decode {tilde over (F)}₁ and {tilde over (F)}₂ to obtain Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n), which had just been encoded into {tilde over (F)}₁ and {tilde over (F)}₂ in EMDC encoder 330.
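By way of illustration only, the three prediction error signals formed by the adders of FIG. 3A may be sketched as element-wise block subtractions. The function names and the list-of-rows block representation are assumptions made for this sketch.

```python
def subtract_blocks(a, b):
    """Element-wise a - b for two equally sized blocks (lists of rows)."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def prediction_errors(x, p0, p1, p2):
    """Return (F, E1, E2) with F = X - P0, E1 = X - P1 and E2 = X - P2."""
    f = subtract_blocks(x, p0)    # central error, coded into F~1 and F~2
    e1 = subtract_blocks(x, p1)   # description one side error, used by ENC1
    e2 = subtract_blocks(x, p2)   # description two side error, used by ENC2
    return f, e1, e2


# Hypothetical 2x2 block and its three predictions.
x = [[10, 12], [14, 16]]
f, e1, e2 = prediction_errors(x, [[9, 12], [13, 16]], [[8, 8], [8, 8]], [[10, 10], [10, 10]])
```

For a two-channel system this yields three error signals per block, consistent with the at least n+1 prediction error signals noted in the summary.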

In the decoder, shown in FIG. 3B, if both descriptions, that is, {tilde over (F)}₁ and {tilde over (F)}₂, are available, an EMDC decoder unit 360 generates {circumflex over (F)}₀ from inputs {tilde over (F)}₁ and {tilde over (F)}₂, where {circumflex over (F)}₀ represents the reconstructed F from both {tilde over (F)}₁ and {tilde over (F)}₂. {circumflex over (F)}₀ is then added to P₀ in an adder 363 to generate a both description recovered macroblock {circumflex over (X)}₀. {circumflex over (X)}₀ is defined as {circumflex over (X)}₀=P₀+{circumflex over (F)}₀. When both descriptions are available, enhancement streams {tilde over (G)}₁ and {tilde over (G)}₂ are not used. When only description one is received, a first side decoder (“DEC1”) 370 produces Ê₁ from inputs Δ{tilde over (C)}_(n) and {tilde over (G)}₁ and then Ê₁ is added to P₁ in an adder 373 to generate a description one recovered macroblock {circumflex over (X)}₁. The description one recovered macroblock is defined as {circumflex over (X)}₁=P₁+Ê₁. When only description two is received, a second side decoder (“DEC2”) 380 produces Ê₂ from inputs Δ{tilde over (D)}_(n) and {tilde over (G)}₂ and then Ê₂ is added to P₂ in an adder 383 to generate a description two recovered macroblock {circumflex over (X)}₂. The description two recovered macroblock, {circumflex over (X)}₂, is defined as {circumflex over (X)}₂=P₂+Ê₂. Embodiments of the decoders DEC1 370 and DEC2 380 are described in reference to FIGS. 3B, 4, 5, 6 and 7. As with the encoder in FIG. 3A, in an alternate embodiment of the decoder (not shown), Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n) are determined from {tilde over (F)}₁ and {tilde over (F)}₂ by branching both of the {tilde over (F)}₁ and {tilde over (F)}₂ channels to connect with DEC1 370 and DEC2 380, respectively. Before the branches connect to DEC1 370 and DEC2 380, they each pass through separate run length decoder units (not shown) to produce Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n), respectively.
As with the alternate embodiment for the encoder described above, this decoder alternative embodiment requires additional run length decoder hardware to extract Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n) from {tilde over (F)}₁ and {tilde over (F)}₂ just before Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n) are extracted from {tilde over (F)}₁ and {tilde over (F)}₂ in EMDC decoder 360.
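By way of illustration only, the choice among the three reconstructions at the decoder may be sketched as follows. The function names and the convention of passing `None` for an error signal whose description was not received are assumptions made for this sketch; the decoding of {circumflex over (F)}₀, Ê₁ and Ê₂ from the bitstreams is taken as already done.

```python
def add_blocks(a, b):
    """Element-wise a + b for two equally sized blocks (lists of rows)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def reconstruct(p0, p1, p2, f0_hat=None, e1_hat=None, e2_hat=None):
    """Recover a macroblock from whichever decoded error signal is available.

    f0_hat is available only when both descriptions were received;
    e1_hat / e2_hat come from side decoders DEC1 / DEC2.
    """
    if f0_hat is not None:            # both descriptions: X0 = P0 + F0
        return add_blocks(p0, f0_hat)
    if e1_hat is not None:            # description one only: X1 = P1 + E1
        return add_blocks(p1, e1_hat)
    if e2_hat is not None:            # description two only: X2 = P2 + E2
        return add_blocks(p2, e2_hat)
    raise ValueError("no description received for this macroblock")
```

The central path takes priority because the enhancement streams are not used when both descriptions are available.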

Note that in this framework, the bits used for G_(i), i=1,2, are purely redundancy bits, because they do not contribute to the reconstruction quality when both descriptions are received. This portion of the total redundancy, denoted by ρ_(e,2), can be controlled directly by varying the quantization accuracy when generating G_(i). The other portion of the total redundancy, denoted by ρ_(e,1), is introduced when coding F using the MDTC coder. Using the MDTC coder enables this redundancy to be controlled easily by varying the transform parameters. The redundancy allocation unit 260 manages the redundancy allocation between ρ_(e,2) and ρ_(e,1) for a given total redundancy in coding the prediction errors.

Based on this framework, alternate embodiments have been developed, which differ in the operations of ENC1 320/DEC1 370 and ENC2 310/DEC2 380. While the same type of EMDC encoder 330 and EMDC decoder 360 described in FIGS. 3A and 3B are used, the way in which {tilde over (G)}_(i) is generated by ENC1 320 and ENC2 310 is different in each of the alternate embodiments. These alternate embodiments are described below in reference to FIGS. 4, 5 and 6.

Implementation of the EMDC ENC1 and ENC2 Encoders

FIG. 4 provides a block diagram of an embodiment of multiple description coding of prediction error in the present invention. In FIG. 4, an MDTC coder is used to implement the EMDC encoder 330 in FIG. 3A. In FIG. 4, for each 8×8 block of central prediction error, P₀ is subtracted from the corresponding 8×8 block from input macroblock X 100 in an adder 306 to produce E₀, and then E₀ is input to the DCT unit 425, which performs DCT and outputs N≦64 DCT coefficients. A pairing unit 430 receives the N≦64 DCT coefficients from the DCT unit 425 and organizes the DCT coefficients into N/2 pairs (Ã_(n), {tilde over (B)}_(n)) using a fixed pairing scheme for all frames. The N/2 pairs are then input with an input, which controls the rate, from a rate and redundancy allocation unit 420 to a first quantizer one (“Q1”) unit 435 and a second Q1 unit 440. The Q1 units 435 and 440, in combination, produce quantized pairs (ΔÃ_(n), Δ{tilde over (B)}_(n)). It should be noted that both N and the pairing strategy are determined based on the statistics of the DCT coefficients, and the k-th largest coefficient is paired with the (N-k)-th largest coefficient. Each quantized pair (ΔÃ_(n), Δ{tilde over (B)}_(n)) is then input with a transform parameter β_(n), which controls a first part of the redundancy, from the rate and redundancy allocation unit 420 to a Pairwise Correlating Transform (“PCT”) unit 445 to produce the coefficients (Δ{tilde over (C)}_(n), Δ{tilde over (D)}_(n)), which are then split into two sets. The unpaired coefficients are split even/odd and appended to the PCT coefficients. The coefficients in each set, Δ{tilde over (C)}_(n) and Δ{tilde over (D)}_(n), are then run length and Huffman coded in run length coding units 450 and 455, respectively, to produce {tilde over (F)}₁ and {tilde over (F)}₂. Thus, {tilde over (F)}₁ contains Δ{tilde over (C)}_(n) in coded run length representation, and {tilde over (F)}₂ contains Δ{tilde over (D)}_(n) in coded run length representation.
In the following, three different embodiments for obtaining {tilde over (G)}_(i) from FIG. 3A are described. For ease of description, in the descriptions related to the detailed operation of the ENC1 320 and ENC2 310 in FIGS. 4, 5 and 6, components in ENC2 310 which are analogous to components in ENC1 320 are denoted with primes. For example, in FIG. 4, ENC1 320 has a DCT component 405 for calculating {tilde over (G)}₁ and ENC2 310 has an analogous DCT component 405′ for calculating {tilde over (G)}₂.
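By way of illustration only, the pairing of coefficients and the pairwise correlating transform may be sketched as follows. The transform matrix is not given in this description; the sketch assumes one commonly used unit-determinant parameterization, C = βA + B/(2β) and D = −βA + B/(2β), and works in floating point rather than on quantized integer indices. The function names and the use of per-coefficient variances as the ranking statistic are likewise assumptions of the sketch.

```python
def make_pairs(variances):
    """Pair the k-th largest-variance coefficient with the k-th smallest.

    Returns a list of (index_a, index_b) tuples; with an odd count the
    middle coefficient is left unpaired.
    """
    order = sorted(range(len(variances)), key=lambda i: variances[i], reverse=True)
    n = len(order)
    return [(order[k], order[n - 1 - k]) for k in range(n // 2)]


def pct_forward(a, b, beta):
    """Map a coefficient pair (A, B) to the correlated pair (C, D)."""
    c = beta * a + b / (2.0 * beta)
    d = -beta * a + b / (2.0 * beta)
    return c, d


def pct_inverse(c, d, beta):
    """Recover (A, B) from (C, D); exact inverse of pct_forward."""
    a = (c - d) / (2.0 * beta)
    b = beta * (c + d)
    return a, b
```

Under this assumed parameterization, C would be sent in description one and D in description two; varying β changes how strongly C and D are correlated, which is the sense in which the transform parameter controls redundancy.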

In accordance with an embodiment of the present invention, shown in FIG. 4, the central prediction error {tilde over (F)}₁ is reconstructed from Δ{tilde over (C)}_(n) and Δ{tilde over (C)}_(n) is also used to generate {tilde over (G)}₁. To generate {tilde over (G)}₁, Δ{tilde over (C)}_(n) from PCT unit 445 is input to an inverse quantizer (“Q₁ ⁻¹”) 460 and dequantized C coefficients, ΔĈ_(n), are output. A linear estimator 465 receives the ΔĈ_(n) and outputs estimated DCT coefficients ΔÂ_(n) ¹ and Δ{circumflex over (B)}_(n) ¹. ΔÂ_(n) ¹ and Δ{circumflex over (B)}_(n) ¹ are then input to inverse pairing unit 470, which converts the N/2 pairs into DCT coefficients and outputs the DCT coefficients to an inverse DCT unit 475, which outputs {circumflex over (F)}₁ to an adder 403. P₁ is subtracted from each corresponding 8×8 block from input macroblock X 100 in the adder 302 and the adder 302 outputs E₁ to the adder 403. {circumflex over (F)}₁ is subtracted from E₁ in the adder 403 and G₁ is output. In the absence of any additional information, the reconstruction from description one alone will be P₁+{circumflex over (F)}₁. To allow for a more accurate reconstruction, G₁ is defined as G₁=X−P₁−{circumflex over (F)}₁, and G₁ is coded into {tilde over (G)}₁ using conventional DCT coding. That is, G₁ is DCT transformed in a DCT coder 405 to produce DCT coefficients for G₁. The DCT coefficients are then input to a quantizer two (“Q₂”) 410, quantized with an input, which controls a second part of redundancy, from the rate and redundancy unit 420 in Q₂ 410 and the quantized coefficients are output from Q₂ 410 to a run length coding unit 415. The quantized coefficients are then run length coded in run length coding unit 415 to produce the description one enhancement stream {tilde over (G)}₁.

Also shown in FIG. 4, the central prediction error {tilde over (F)}₂ is reconstructed from Δ{tilde over (D)}_(n) and Δ{tilde over (D)}_(n) is also used to generate {tilde over (G)}₂. To generate {tilde over (G)}₂, Δ{tilde over (D)}_(n) from PCT unit 445′ is input to Q₁ ⁻¹ 460′ and dequantized D coefficients, Δ{circumflex over (D)}_(n), are output. A linear estimator 465′ receives the Δ{circumflex over (D)}_(n) and outputs estimated DCT coefficients ΔÂ_(n) ² and Δ{circumflex over (B)}_(n) ². ΔÂ_(n) ² and Δ{circumflex over (B)}_(n) ² are then input to inverse pairing unit 470′, which converts the N/2 pairs into DCT coefficients and outputs the DCT coefficients to an inverse DCT unit 475′, which outputs {circumflex over (F)}₂ to an adder 403′. P₂ is subtracted from each corresponding 8×8 block from input macroblock X 100 in the adder 304 and the adder 304 outputs E₂ to the adder 403′. {circumflex over (F)}₂ is subtracted from E₂ in the adder 403′ and G₂ is output. In the absence of any additional information, the reconstruction from description two alone will be P₂+{circumflex over (F)}₂. To allow for a more accurate reconstruction, G₂ is defined as G₂=X−P₂−{circumflex over (F)}₂, and G₂ is coded into {tilde over (G)}₂ using conventional DCT coding. That is, G₂ is DCT transformed in a DCT coder 405′ to produce DCT coefficients for G₂. The DCT coefficients are then input to Q₂ 410′, quantized with an input from the rate and redundancy unit 420 in Q₂ 410′ and the quantized coefficients are output from Q₂ 410′ to a run length coding unit 415′. The quantized coefficients are then run length coded in run length coding unit 415′ to produce the description two enhancement stream {tilde over (G)}₂.
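By way of illustration only, the linear estimation of a coefficient pair from a single received PCT coefficient, and the mismatch signal G₁=X−P₁−{circumflex over (F)}₁, may be sketched as follows. The estimator form is not specified in this description; the sketch assumes the transform C = βA + B/(2β), D = −βA + B/(2β) with A and B uncorrelated and zero-mean with known variances, for which the best linear estimate of D from C is a fixed gain times C. All function and parameter names are hypothetical.

```python
def estimator_gain(var_a, var_b, beta):
    """Gain g such that g * C is the linear least-squares estimate of D."""
    cross = var_b / (4.0 * beta ** 2) - beta ** 2 * var_a    # E[C * D]
    power = var_b / (4.0 * beta ** 2) + beta ** 2 * var_a    # E[C * C]
    return cross / power


def estimate_pair_from_c(c_hat, var_a, var_b, beta):
    """Estimate (A, B) when only the dequantized C coefficient is known."""
    d_est = estimator_gain(var_a, var_b, beta) * c_hat
    a_est = (c_hat - d_est) / (2.0 * beta)     # inverse PCT on (C, estimated D)
    b_est = beta * (c_hat + d_est)
    return a_est, b_est


def mismatch(x, p1, f1_hat):
    """G1 = X - P1 - F1_hat, computed element-wise on equally sized blocks."""
    return [[xv - pv - fv for xv, pv, fv in zip(rx, rp, rf)]
            for rx, rp, rf in zip(x, p1, f1_hat)]
```

The description two branch would be symmetric, estimating C from the received D and forming G₂ from P₂ and {circumflex over (F)}₂.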

In accordance with the current embodiment of the present invention, the EMDC decoder 360 in FIG. 3B is implemented as an inverse circuit of the EMDC encoder 330 described in FIG. 4. With the exception of the rate and redundancy unit 420, all of the other components described have analogous inverse components implemented in the decoder. For example, in the EMDC decoder, if only description one is received, the same operation as described above for the encoder is used to generate {circumflex over (F)}₁ from Δ{tilde over (C)}_(n). In addition, by inverse quantization and inverse DCT, the quantized version of G₁, denoted by Ĝ₁, is recovered from {tilde over (G)}₁. The finally recovered block in this side decoder is X₁, which is defined as X₁=P₁+{circumflex over (F)}₁+Ĝ₁.

In the embodiment of FIG. 4, more than 64 coefficients need to be coded in the EMDC 330 and ENC1 320 together. While the use of the 64 coefficients completely codes the mismatch error, G₁, subject to quantization errors, it requires too many bits. Therefore, in accordance with another embodiment of the present invention, only 32 coefficients are coded when generating {tilde over (G)}₁, by only including the error for the D coefficients. Likewise, only 32 coefficients are coded when generating {tilde over (G)}₂, by only including the error for the C coefficients. Specifically, as shown in FIG. 5, DCT is applied to side prediction error E₁ in the DCT coder 405, where E₁=X−P₁, and the same pairing scheme as in the central coder is applied to generate N pairs of DCT coefficients in pairing unit 510.

As in FIG. 4, in FIG. 5, to implement the EMDC encoder 330, a MDTC coder is used. For each 8×8 block of central prediction error, P₀ is subtracted from each corresponding 8×8 block from input macroblock X 100 in the adder 306 to produce E₀ and then E₀ is input to the DCT unit 425 which performs DCT on E₀ and outputs N≦64 DCT coefficients. In pairing unit 430, the coder takes the N≦64 DCT coefficients from the DCT unit 425 and organizes them into N/2 pairs (Ã_(n), {tilde over (B)}_(n)) using a fixed pairing scheme for all frames. The N/2 pairs are then input with an input from the rate and redundancy allocation unit 420 to the Q1 quantizer units 435 and 440, respectively, and Q1 quantizer units 435 and 440 produce quantized pairs (ΔÃ_(n), Δ{tilde over (B)}_(n)), respectively. It should be noted that both N and the pairing strategy are determined based on the statistics of the DCT coefficients and the k-th largest coefficient is paired with the (N-k)-th largest coefficient. Each quantized pair (ΔÃ_(n), Δ{tilde over (B)}_(n)) is input with an input from the rate and redundancy allocation unit 420 to a PCT unit 445 with the transform parameter β_(n) to produce the coefficients (Δ{tilde over (C)}_(n), Δ{tilde over (D)}_(n)), which are then split into two sets. The unpaired coefficients are split even/odd and appended to the PCT coefficients.

In accordance with an embodiment of the present invention, shown in FIG. 5, an estimate of the central prediction error {tilde over (F)}₁ is reconstructed from Δ{tilde over (C)}_(n) and Δ{tilde over (C)}_(n) is also used to generate {tilde over (G)}₁. To generate {tilde over (G)}₁, Δ{tilde over (C)}_(n) from PCT unit 445 is input to Q₁ ⁻¹ 460 and dequantized C coefficients, ΔĈ_(n), are output to a linear estimator 530. The linear estimator 530 receives the ΔĈ_(n) and outputs an estimated DCT coefficient {circumflex over (D)}_(n) ¹ which is input to an adder 520. P₁ is subtracted from each corresponding 8×8 block from input macroblock X 100 in the adder 302 to produce side prediction error E₁ which is then input to conventional DCT coder 405 where DCT is applied to E₁. The output of the DCT coder 405 is input to pairing unit 510 and the same pairing scheme as described above for pairing unit 430 is applied to generate N pairs of DCT coefficients. The N pairs of DCT coefficients are then input to a PCT unit 515 with transform parameter β_(n) which generates only the D component, D_(n) ¹. Then, D_(n) ¹ is input to an adder 520 and {circumflex over (D)}_(n) ¹ is subtracted from D_(n) ¹ and an error C_(n) ^(⊥) is output. The error C_(n) ^(⊥), which is defined as C_(n) ^(⊥)=D_(n) ¹−{circumflex over (D)}_(n) ¹, is input with an input from the rate and redundancy allocation unit 420 to Q2 525 and quantized to produce a quantized error, Ĉ_(n) ^(⊥). The {tilde over (C)}_(n) coefficients from the PCT unit 515 and the quantized error Ĉ_(n) ^(⊥) are then together subjected to run-length coding in run length coding unit 450 to produce a resulting bitstream {tilde over (F)}₁, {tilde over (G)}₁, which constitutes {tilde over (F)}₁ and {tilde over (G)}₁ from FIG. 3A.

Likewise, an estimate of the central prediction error {tilde over (F)}₂ is reconstructed from Δ{tilde over (D)}_(n) and Δ{tilde over (D)}_(n) is also used to generate {tilde over (G)}₂. To generate {tilde over (G)}₂, {tilde over (D)}_(n) from PCT unit 445′ is input to Q₁ ⁻¹ 460′ and dequantized D coefficients, Δ{tilde over (D)}_(n), are output to a linear estimator 530′. The linear estimator 530′ receives the Δ{tilde over (D)}_(n) and outputs an estimated DCT coefficient Ĉ_(n) ¹, which is input to an adder 520′. P₂ is subtracted from each corresponding 8×8 block from input macroblock X 100 in the adder 304 to produce side prediction error E₂, which is then input to conventional DCT coder 405′, where DCT is applied to E₂. The output of the DCT coder 405′ is input to pairing unit 510′ and the same pairing scheme as described above for pairing unit 430 is applied to generate N pairs of DCT coefficients. The N pairs of DCT coefficients are then input to a PCT unit 515′ with transform parameter β_(n), which generates only the C component, C_(n) ¹. Then, C_(n) ¹ is input to an adder 520′ and Ĉ_(n) ¹ is subtracted from C_(n) ¹ and an error D_(n) ^(⊥) is output. The error D_(n) ^(⊥), which is defined as D_(n) ^(⊥)=C_(n) ¹−Ĉ_(n) ¹, is input with an input from the rate and redundancy allocation unit 420 to Q2 525′ and quantized to produce a quantized error, {circumflex over (D)}_(n) ^(⊥). The {tilde over (D)}_(n) coefficients from the PCT unit 515′ and the quantized error {circumflex over (D)}_(n) ^(⊥) are then together subjected to run-length coding in run length coding unit 450′ to produce a resulting bitstream {tilde over (F)}₂, {tilde over (G)}₂, which constitutes {tilde over (F)}₂ and {tilde over (G)}₂ from FIG. 3A.

In accordance with the current embodiment of the present invention, the DEC1 370 from FIG. 3B is implemented as an inverse circuit of the ENC1 320 described in FIG. 4. With the exception of the rate and redundancy unit 420, all of the other components described have analogous inverse components implemented in the decoder. For example, in the DEC1 370, if only description one is received, which includes, after run length decoding and dequantization, Ĉ_(n) and Ĉ_(n) ^(⊥), the PCT coefficients corresponding to the side prediction error can be estimated by Ĉ_(n) ¹=Ĉ_(n), {circumflex over (D)}_(n) ¹={circumflex over (D)}_(n) ¹(Ĉ_(n))+Ĉ_(n) ^(⊥). Then inverse PCT can be performed on Ĉ_(n) ¹ and {circumflex over (D)}_(n) ¹, followed by inverse DCT to arrive at quantized prediction error Ê₁. The finally recovered macroblock, X₁, is reconstructed by adding P₁ and Ê₁ together, such that X₁=P₁+Ê₁.
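The single-description estimate described above can be sketched as follows. This is only an illustration under stated assumptions: the linear estimator is modeled as a hypothetical scalar gain `gamma`, the function names are invented for this sketch, and the inverse PCT and inverse DCT stages are omitted.

```python
def estimate_side_pct_pair(c_hat, c_perp_hat, gamma):
    """Estimate the PCT pair when only description one is received.

    gamma is a hypothetical scalar gain standing in for the linear
    estimator; the source does not give the estimator's form.
    """
    c1 = c_hat                        # C-hat_n^1 = C-hat_n
    d1 = gamma * c_hat + c_perp_hat   # D-hat_n^1 = estimate(C-hat_n) + C-hat_n^perp
    return c1, d1


def reconstruct_block(prediction, error):
    """Recover X_1 = P_1 + E-hat_1, element by element."""
    return [[p + e for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(prediction, error)]


c1, d1 = estimate_side_pct_pair(c_hat=8.0, c_perp_hat=-1.5, gamma=0.5)
recovered = reconstruct_block([[1, 2]], [[3, -1]])
```

Here the correction term Ĉ_(n) ^(⊥) is simply added to the linear estimate, which is the only arithmetic the text specifies for this step.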

In another embodiment of the present invention, the strategy is to ignore the error in the side predictor and use some additional redundancy to improve the reconstruction accuracy for the D_(n) in the central predictor. This is accomplished by quantizing and coding the estimation error C_(n) ^(⊥)=Δ{circumflex over (D)}_(n)−{circumflex over (D)}_(n)(Ĉ_(n)), as shown in FIG. 6. This scheme is the same as the generalized PCT, where four variables are used to represent the initial pair of two coefficients.

As in the previously described embodiments, in FIG. 6, to implement the EMDC encoder 330, an MDTC coder is used. For each 8×8 block of central prediction error, P₀ is subtracted from each corresponding 8×8 block from input macroblock X 100 in the adder 306 to produce E₀, and then E₀ is input to the DCT unit 425, which performs DCT on E₀ and outputs N≦64 DCT coefficients. A pairing unit 430 receives the N≦64 DCT coefficients from the DCT unit 425 and organizes them into N/2 pairs (Ã_(n), {tilde over (B)}_(n)) using a fixed pairing scheme for all frames. The N/2 pairs are then input with an input from the rate and redundancy allocation unit 420 to Q1 quantizer units 435 and 440, respectively, and Q1 quantizer units 435 and 440 produce quantized pairs (ΔÃ_(n), Δ{tilde over (B)}_(n)), respectively. It should be noted that both N and the pairing strategy are determined based on the statistics of the DCT coefficients and the k-th largest coefficient is paired with the (N−k)-th largest coefficient. Each quantized pair (ΔÃ_(n), Δ{tilde over (B)}_(n)) is input with an input from the rate and redundancy allocation unit 420 to the PCT unit 445 with the transform parameter β_(n) to produce the PCT coefficients (Δ{tilde over (C)}_(n), Δ{tilde over (D)}_(n)), which are then split into two sets. The unpaired coefficients are split even/odd and appended to the PCT coefficients.

In accordance with an embodiment of the present invention, shown in FIG. 6, {tilde over (C)}_(n) is input to inverse quantizer Q₁ ⁻¹ 460 and dequantized C coefficients, ΔĈ_(n), are output to a linear estimator 610. The linear estimator 610 is applied to ΔĈ_(n) to produce an estimated DCT coefficient {circumflex over (D)}_(n), which is output to an adder 630. Similarly, {tilde over (D)}_(n) is input to a second inverse quantizer Q₁ ⁻¹ 620 and dequantized D coefficients, Δ{circumflex over (D)}_(n), are also output to the adder 630. Then, {circumflex over (D)}_(n) is subtracted from Δ{circumflex over (D)}_(n) in the adder 630 and the error C_(n) ^(⊥) is output. The error C_(n) ^(⊥)=Δ{circumflex over (D)}_(n)−{circumflex over (D)}_(n)(Ĉ_(n)) is input with an input from the rate and redundancy allocation unit 420 to quantizer Q2 640 and quantized to produce Ĉ_(n) ^(⊥). The {tilde over (C)}_(n) coefficients and the quantized error Ĉ_(n) ^(⊥) are then together subjected to run-length coding in run length coding unit 650 to produce the resulting bitstream {tilde over (F)}₁, {tilde over (G)}₁, which constitutes {tilde over (F)}₁ and {tilde over (G)}₁ from FIG. 3A.
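The residual path through adder 630 and quantizer Q2 640 can be illustrated with a minimal sketch. It assumes a uniform scalar quantizer for Q2 and a hypothetical scalar gain `gamma` for the linear estimator 610; neither choice is stated in the description, and the function names are illustrative only.

```python
def quantize(value, step):
    """Uniform scalar quantizer (assumed form of Q2): return an integer index.

    Python's round() resolves exact ties to the nearest even integer.
    """
    return round(value / step)


def dequantize(index, step):
    """Map a quantizer index back to a reconstruction level."""
    return index * step


def encode_residual(c_deq, d_deq, gamma, q2_step):
    """Quantize C_n^perp, the error of estimating D from C."""
    d_est = gamma * c_deq      # linear estimate of D_n from dequantized C_n
    c_perp = d_deq - d_est     # estimation error output by the adder
    return quantize(c_perp, q2_step)


def decode_d(c_deq, residual_index, gamma, q2_step):
    """Side decoder: linear estimate plus the dequantized residual."""
    return gamma * c_deq + dequantize(residual_index, q2_step)


index = encode_residual(c_deq=12.0, d_deq=9.4, gamma=0.5, q2_step=2.0)
d_side = decode_d(c_deq=12.0, residual_index=index, gamma=0.5, q2_step=2.0)
```

With these example values the estimate alone misses D_(n) by 3.4, while adding the dequantized residual brings the error within half a Q2 step. A coarser Q2 step spends fewer redundancy bits at the cost of a larger remaining error.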

Similarly, in FIG. 6, {tilde over (D)}_(n) is input to inverse quantizer Q₁ ⁻¹ 460′ and dequantized D coefficients, Δ{circumflex over (D)}_(n), are output to a linear estimator 610′. The linear estimator 610′ is applied to Δ{circumflex over (D)}_(n) to produce an estimated DCT coefficient Ĉ_(n), which is output to an adder 630′. Similarly, {tilde over (C)}_(n) is input to a second inverse quantizer Q₁ ⁻¹ 620′ and dequantized C coefficients, ΔĈ_(n), are also output to the adder 630′. Then, Ĉ_(n) is subtracted from ΔĈ_(n) in the adder 630′ and the error D_(n) ^(⊥) is output. The error D_(n) ^(⊥) is input with an input from the rate and redundancy allocation unit 420 to quantizer Q2 640′ and quantized to produce {circumflex over (D)}_(n) ^(⊥). The {tilde over (D)}_(n) coefficients and the quantized error {circumflex over (D)}_(n) ^(⊥) are then together subjected to run-length coding in run length coding unit 650′ to produce the resulting bitstream {tilde over (F)}₂, {tilde over (G)}₂, which constitutes {tilde over (F)}₂ and {tilde over (G)}₂ from FIG. 3A.

In accordance with the current embodiment of the present invention, the DEC2 decoder 380 from FIG. 3B is implemented as an inverse circuit of the ENC2 encoder 310 described in FIG. 4. With the exception of the rate and redundancy unit 420, all of the other components described have analogous inverse components implemented in the decoder. For example, the DEC2 decoder 380 operation is the same as in the DEC1 decoder 370 embodiment; the recovered prediction error is actually a quantized version of F, so that X₁=P₁+{circumflex over (F)}. Therefore, in this implementation, the mismatch between P₀ and P₁ is left as is and allowed to accumulate over time in successive P-frames. However, the effect of this mismatch is eliminated upon each new I-frame.

In all of the above embodiments, the quantization parameter in Q1 controls the rate, the transform parameter β_(n) controls the first part of redundancy ρ_(e,1), and the quantization parameter in Q2 controls the second part of redundancy ρ_(e,2). In each embodiment, these parameters are controlled by the rate and redundancy allocation component 420. This allocation is performed based on a theoretical analysis of the trade-off between rate, redundancy, and distortion associated with each implementation. In addition to redundancy allocation between ρ_(e,1) and ρ_(e,2) for a given P-frame, the total redundancy, ρ, among successive frames must be allocated. This is accomplished by treating coefficients from different frames as different coefficient pairs.

Multiple Description Motion Estimation and Coding (MDMEC)

In accordance with an embodiment of the present invention, illustrated in FIG. 7, in a motion estimation component 710, conventional motion estimation is performed to find the best motion vector for each input macroblock X 100. In an alternate embodiment (not shown), a simplified method for performing motion estimation is used in which the motion vectors from the input macroblock X 100 are duplicated on both channels. FIG. 8 shows an arrangement of odd and even macroblocks within each digitized frame in accordance with an embodiment of the present invention. Returning to FIG. 7, the motion estimation component 710 is connected to a video input unit (not shown) for receiving the input macroblocks and to FB0 270 (not shown) for receiving reconstructed macroblocks from previously reconstructed frames from both descriptions, ψ_(o,k-1). The motion estimation component 710 is also connected to a motion-encoder-1 730, an adder 715 and an adder 718. Motion-encoder-1 730 is connected to a motion-interpolator-1 725 and the motion-interpolator-1 725 is connected to the adder 715. The adder 715 is connected to a motion-encoder-3 720. Similarly, motion-encoder-2 735 is connected to a motion-interpolator-2 740 and the motion-interpolator-2 740 is connected to the adder 718. The adder 718 is connected to a motion-encoder-4 745.

In FIG. 7, the motion vectors for the even macroblocks output from the motion estimation unit 710, denoted by m₁, are input to Motion-Encoder-1 730, and coded to yield {tilde over (m)}_(1,1) and reconstructed motions {circumflex over (m)}_(1,1). The reconstructed motions, {circumflex over (m)}_(1,1), are input to motion interpolator-1 725, which interpolates motions in odd macroblocks from the coded ones in even macroblocks, and outputs m_(2,p) to adder 715. In adder 715, m_(2,p) is subtracted from m₂ and m_(1,2) is output, where m₂ was received from motion estimation unit 710. m_(1,2) is then input to motion encoder-3 720 and {tilde over (m)}_(1,2) is output. Similarly, motion vectors for the odd macroblocks, m₂, are input to and coded by Motion-Encoder-2 735, and the coded bits and reconstructed motions, denoted by {tilde over (m)}_(2,1) and {circumflex over (m)}_(2,1), respectively, are output. The reconstructed motions, {circumflex over (m)}_(2,1), are input to motion interpolator-2 740, which interpolates motions in even macroblocks from the coded ones in odd macroblocks, and outputs m_(1,p) to adder 718. In adder 718, m_(1,p) is subtracted from m₁ and m_(2,2) is output, where m₁ was received from motion estimation unit 710. m_(2,2) is then input to motion encoder-4 745 and {tilde over (m)}_(2,2) is output.
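The signal flow of FIG. 7 can be sketched on a single row of macroblocks. The sketch makes several assumptions that are not in the description: even and odd macroblocks simply alternate along the row (the actual arrangement is the one in FIG. 8), each interpolator averages the available left and right neighbours, and m₁ and m₂ are coded losslessly so the reconstructed motions equal the originals. The function names are illustrative.

```python
def interpolate_row(row, parity):
    """Predict vectors at positions of the given parity (0 even, 1 odd)
    by averaging the neighbouring vectors of the other parity."""
    predicted = []
    for i in range(parity, len(row), 2):
        neighbours = [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
        dx = sum(v[0] for v in neighbours) / len(neighbours)
        dy = sum(v[1] for v in neighbours) / len(neighbours)
        predicted.append((dx, dy))
    return predicted


def mdmec_split(row):
    """Form the motion content of the two descriptions for one row."""
    m1 = row[0::2]                          # even macroblocks
    m2 = row[1::2]                          # odd macroblocks
    m2_p = interpolate_row(row, parity=1)   # odd predicted from even
    m1_p = interpolate_row(row, parity=0)   # even predicted from odd
    m12 = [(a[0] - p[0], a[1] - p[1]) for a, p in zip(m2, m2_p)]
    m22 = [(a[0] - p[0], a[1] - p[1]) for a, p in zip(m1, m1_p)]
    return {"description_1": (m1, m12), "description_2": (m2, m22)}


row = [(2, 0), (4, 0), (6, 0), (6, 2)]
descriptions = mdmec_split(row)
```

Description one then carries the even vectors plus the interpolation residual for the odd ones, and description two carries the mirror image, so either description alone can approximate the full motion field.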

For a lossless description of motion, all of the four encoders involved should be lossless. An encoder is “lossless” when the decoder can create an exact reconstruction of the encoded signal, and an encoder is “lossy” when the decoder cannot create an exact reconstruction of the encoded signal. In accordance with an embodiment of the present invention, lossless coding is used for m₁ and m₂ and lossy coding is used for m_(1,2) and m_(2,2).

The bits used for coding m_(1,2) and m_(2,2) are ignored when both descriptions are received and, therefore, are purely redundancy bits. This part of the redundancy for motion coding is denoted by ρ_(m,2). The extra bits in independent coding of m₁ and m₂, compared to joint coding, contribute to the other portion of the redundancy. This is denoted by ρ_(m,1).

In another embodiment of the present invention, conventional motion estimation is first performed to find the best motion vector for each macroblock. Then, the horizontal and vertical components of each motion vector are treated as two independent variables (a pre-whitening transform can be applied to reduce the correlation between the two components) and the generalized MDTC method is applied to each motion vector. Let m_(h), m_(v) represent the horizontal and vertical components of a motion vector. Using a pairing transform, T, the transformed coefficients are obtained from Equation (1):

$\begin{bmatrix} m_{c} \\ m_{d} \end{bmatrix} = T \begin{bmatrix} m_{h} \\ m_{v} \end{bmatrix} \qquad (1)$

where {tilde over (m)}_(i,1), i=1,2, represents the bits used to code m_(c) and m_(d), respectively, and m_(i,2), i=1,2, represents the bits used to code m_(c) ^(⊥) and m_(d) ^(⊥), the estimation error for m_(d) from m_(c) and the estimation error for m_(c) from m_(d), respectively. The transform parameters in T are controlled based on the desired redundancy.
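As a concrete illustration of Equation (1), the sketch below applies a 2×2 pairing transform to one motion vector and inverts it. The description does not give the entries of T, so a plane rotation by a hypothetical angle `theta` is assumed here purely as an example of an invertible transform whose parameter could be tuned for the desired redundancy.

```python
import math


def pairing_transform(m_h, m_v, theta=math.pi / 4):
    """Apply an assumed rotation-form T to (m_h, m_v), giving (m_c, m_d)."""
    c, s = math.cos(theta), math.sin(theta)
    m_c = c * m_h + s * m_v
    m_d = -s * m_h + c * m_v
    return m_c, m_d


def inverse_pairing_transform(m_c, m_d, theta=math.pi / 4):
    """Invert the assumed rotation to recover (m_h, m_v)."""
    c, s = math.cos(theta), math.sin(theta)
    m_h = c * m_c - s * m_d
    m_v = s * m_c + c * m_d
    return m_h, m_v


m_c, m_d = pairing_transform(3.0, -1.0)
m_h, m_v = inverse_pairing_transform(m_c, m_d)
```

When both m_(c) and m_(d) arrive, the inverse recovers the motion vector up to floating-point precision; when only one arrives, the other would have to be estimated from it, which is where the coded estimation errors come in.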

In another embodiment of the present invention (not shown), each horizontal or vertical motion component is quantized using MDSQ to produce two bit streams for all the motion vectors.

Application of MDTC to Block DCT Coding

The MDTC approach was originally developed and analyzed for an ordered set of N Gaussian variables with zero means and decreasing variances. When applying this approach to DCT coefficients of a macroblock (either an original or a prediction error macroblock), which are not statistically stationary and are inherently two-dimensional, there are many possibilities in terms of how to select and order coefficients to pair. In the conventional run length coding approach for macroblock DCT coefficients, used in all of the current video coding standards, each element of the two-dimensional DCT coefficient array is first quantized using a predefined quantization matrix and a scaling parameter. The quantized coefficient indices are then converted into a one-dimensional array, using a predefined ordering, for example, the zigzag order. For image macroblocks, consecutive high frequency DCT coefficients tend to be zero and, as a result, the run length coding method, which counts how many zeros occur before a non-zero coefficient, has been devised. A pair of symbols, which consists of a run length value and the non-zero value, is then entropy coded.
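The conventional zigzag scan and run-length step described above can be sketched as follows. The sketch is illustrative: it uses a 4×4 block for brevity, drops trailing zeros instead of emitting an end-of-block symbol, and stops before entropy coding. The function names are not from the description.

```python
def zigzag_order(size=8):
    """Return the (row, col) positions of a size x size block in zigzag order."""
    return sorted(
        ((r, c) for r in range(size) for c in range(size)),
        # Walk anti-diagonals; odd diagonals run top-to-bottom,
        # even diagonals run bottom-to-top.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_length_encode(indices):
    """Return (run, value) pairs, where run counts the zeros that
    precede each non-zero value. Trailing zeros are dropped."""
    pairs, run = [], 0
    for value in indices:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


block = [
    [5, 3, 0, 0],
    [2, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
scanned = [block[r][c] for r, c in zigzag_order(4)]
pairs = run_length_encode(scanned)
```

For this block the scan visits 5, 3, 2, then five zeros, then 1, so the symbol list is (0, 5), (0, 3), (0, 2), (5, 1).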

In an embodiment of the present invention, to overcome the non-stationarity of the DCT coefficients as described above, each image is divided into macroblocks in a few classes so that the DCT coefficients in each class are approximately stationary. For each class, the variances of the DCT coefficients are collected, and based on the variances, the number of coefficients to pair, N, the pairing mechanism and the redundancy allocation are determined. These are determined based on a theoretical analysis of the redundancy-rate-distortion performance of MDTC. Specifically, the k-th largest coefficient in variance is always paired with the (N−k)-th largest, with a fixed transform parameter prescribed by the optimal redundancy allocation. The operation for macroblocks in each class is the same as that described above for the implementation of EMDC. For a given macroblock, it is first transformed into DCT coefficients, quantized, and classified into one of the predefined classes. Then, depending on the determined class, the first N DCT coefficients are paired and transformed using PCT, while the rest are split even/odd and appended to the PCT coefficients. The coefficients in each description (C coefficients and remaining even coefficients, or D coefficients and remaining odd coefficients) usually have many zeros. Therefore, the run length coding scheme is separately applied to the two coefficient streams.
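The pairing rule and the even/odd split of the remaining coefficients can be sketched as below. Two assumptions are made for illustration: indices are 0-based, so the k-th largest-variance coefficient is paired with the (N−1−k)-th, and the "first N" coefficients are taken to be the first N positions of the scanned array. The PCT itself is omitted because its exact form is not given in this passage.

```python
def pair_by_variance(variances, n_pair):
    """Pair the first n_pair coefficient indices so that a high-variance
    coefficient is matched with a low-variance one."""
    ranked = sorted(range(n_pair), key=lambda i: variances[i], reverse=True)
    return [(ranked[k], ranked[n_pair - 1 - k]) for k in range(n_pair // 2)]


def split_unpaired(coeffs, n_pair):
    """Split the coefficients after the first n_pair by even/odd index."""
    rest = range(n_pair, len(coeffs))
    even = [coeffs[i] for i in rest if i % 2 == 0]
    odd = [coeffs[i] for i in rest if i % 2 == 1]
    return even, odd


class_variances = [90, 40, 60, 10, 5, 2]   # hypothetical per-class statistics
pairs = pair_by_variance(class_variances, n_pair=4)
even, odd = split_unpaired([9, 4, 6, 1, 7, 8], n_pair=4)
```

Because the pairing depends only on per-class variances, it can be fixed in advance and shared by encoder and decoder without side information per macroblock.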

In an alternative embodiment of the present invention (not shown), instead of using a fixed pairing scheme for each macroblock in the same class, which could be pairing zero coefficients, a second option is to first determine any non-zero coefficients (after quantization), and then apply MDTC only to the non-zero coefficients. In this embodiment, both the location and the value of the non-zero coefficients need to be specified in both descriptions. One implementation strategy is to duplicate the information characterizing the locations of the two coefficients in both descriptions, but split the two coefficient values using MDTC. A suitable pairing scheme is needed for the non-zero coefficients. An alternative implementation strategy is to duplicate some of the non-zero coefficients, while splitting the remaining ones in an even/odd manner.

FIG. 9 is a flow diagram representation of an embodiment of an encoder operation in accordance with the present invention. In FIG. 9, in block 905 a sequence of video frames is received and in block 910 the frame index value k is initialized to zero. In block 915 the next frame in the sequence of video frames is divided into a macroblock representation of the video frame. In an embodiment of the present invention, the macroblock is a 16×16 macroblock. Then, in block 920, for a first macroblock a decision is made on which mode will be used to code the macroblock. If the I-mode is selected in block 920, then, in block 925, the 16×16 macroblock representation is divided into 8×8 blocks and in block 930 DCT is applied to each of the 8×8 blocks and the resulting DCT coefficients are passed to block 935. In an embodiment of the present invention, four 8×8 blocks are created to represent the luminance characteristics and two 8×8 blocks are created to represent the chrominance characteristics of the macroblock. In block 935, a four-variable transform is applied to the DCT coefficients to produce 128 coefficients, which, in block 940, are decomposed into two sets of 64 coefficients. The two sets of 64 coefficients are each run length coded to form two separate descriptions in block 945. Then, in block 950, each description is output to one of two channels. In block 952, a check is made to determine if there are any more macroblocks in the current video frame to be coded. If there are more macroblocks to be coded, then the encoder returns to block 920 and continues with the next macroblock. If there are not any more macroblocks to be coded in block 952, then, in block 954, a check is made to determine if there are any more frames to be coded, and if there are not any more frames to be coded in block 954, then the encoder operation ends. If, in block 954, it is determined that there are more frames to be coded, then, in block 955, the frame index k is incremented by 1 and operation returns to block 915 to begin coding the next video frame.
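The division performed in block 925, and its inverse on the decoder side, can be sketched for the luminance part of a macroblock. This is a minimal illustration: the macroblock is a plain list of rows, the blocks are produced in raster order, and the function names are invented for this sketch. Chrominance blocks are not handled.

```python
import math


def split_macroblock(mb, block=8):
    """Split a square macroblock into block x block tiles in raster order."""
    size = len(mb)
    return [
        [row[c:c + block] for row in mb[r:r + block]]
        for r in range(0, size, block)
        for c in range(0, size, block)
    ]


def assemble_macroblock(blocks, block=8):
    """Reassemble raster-ordered tiles into one square macroblock."""
    per_row = math.isqrt(len(blocks))
    rows = []
    for block_row in range(per_row):
        for y in range(block):
            row = []
            for block_col in range(per_row):
                row.extend(blocks[block_row * per_row + block_col][y])
            rows.append(row)
    return rows


macroblock = [[r * 16 + c for c in range(16)] for r in range(16)]
blocks = split_macroblock(macroblock)
restored = assemble_macroblock(blocks)
```

A 16×16 macroblock yields four 8×8 luminance blocks, and reassembling them returns the original macroblock unchanged.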

If, in block 920, the P-mode is selected, then, in block 960, the three best prediction macroblocks are determined with their corresponding motion vectors and prediction errors using a reconstructed previous frame from both descriptions and zero, one or two of the reconstructed previous frames from description one and description two. Then, in block 965, for the three best macroblocks a decision is made on which mode will be used to code the macroblocks. If the I-mode is selected in block 965, then the macroblocks are coded using the method described above for blocks 925 through block 955. If the P-mode is selected in block 965, then, in block 970, each of the three prediction error macroblocks is divided into a set of 8×8 blocks. In block 975, DCT is applied to each of the three sets of 8×8 blocks to produce three sets of DCT coefficients for each block and, then, in block 980, a four-variable pairing transform is applied to each of the three sets of DCT coefficients for each block to produce three sets of 128 coefficients. Each of the three sets of 128 coefficients from block 980 is decomposed into two sets of 64 coefficients in block 985 and the results are provided to block 990. In block 990, up to two motion vectors and each of the two sets of 64 coefficients are encoded using run-length coding to form two descriptions. Then, in block 950, each description is output to one of two channels. In block 952, a check is made to determine if there are any more macroblocks in the current video frame to be coded. If there are more macroblocks to be coded, then the encoder returns to block 920 and continues with the next macroblock. If there are not any more macroblocks to be coded in block 952, then, in block 954, a check is made to determine if there are any more frames to be coded, and if there are not any more frames to be coded in block 954, then the encoder operation ends. If, in block 954, it is determined that there are more frames to be coded, then, in block 955, the frame index k is incremented by 1 and operation returns to block 915 to begin coding the next video frame.

FIG. 10 is a flow diagram representation of the operations performed by a decoder when the decoder is receiving both descriptions, in accordance with an embodiment of the present invention. In FIG. 10, in block 1005 the frame index k is initialized to zero. Then, in block 1010, the decoder receives bitstreams from both channels and in block 1015 the bitstreams are decoded to the macroblock level for each frame in the bitstreams. In block 1020, the mode to be used for a decoded macroblock is determined. If, in block 1020, the mode to be used for the macroblock is determined to be the I-mode, then, in block 1025, the macroblock is decoded to the block level. In block 1030, each block from the macroblock is decoded into two sets of 64 coefficients, and in block 1035 an inverse four-variable pairing transform is applied to each of the two sets of 64 coefficients to produce the DCT coefficients for each block. In block 1040, an inverse 8×8 DCT is applied to the DCT coefficients for each block to produce four 8×8 blocks. Then, in block 1045, the four 8×8 blocks are assembled into one 16×16 macroblock.

If, in block 1020, the mode to be used for the macroblock is determined to be the P-mode, then, in block 1065, the motion vectors are decoded and a prediction macroblock is formed from a reconstructed previous frame from both descriptions. In block 1070 the prediction macroblock from block 1065 is decoded to the block level. Then, in block 1075, each block from the prediction macroblock is decoded into two sets of 64 coefficients, and in block 1080 an inverse four-variable pairing transform is applied to each of the two sets of coefficients to produce the DCT coefficients for each block. In block 1085, an inverse 8×8 DCT is applied to the DCT coefficients for each block to produce four 8×8 blocks. Then, in block 1090, the four 8×8 blocks are assembled into one 16×16 macroblock, and in block 1095 the 16×16 macroblock from block 1090 is added to the prediction macroblock which was formed in block 1065.

Regardless of whether I-mode or P-mode decoding is used, after either block 1045 or block 1095, in block 1050 the macroblocks from block 1045 and block 1095 are assembled into a frame. Then, in block 1052, a check is made to determine if there are any more macroblocks in the current video frame to be decoded. If there are more macroblocks to be decoded, then the decoder returns to block 1020 and continues with the next macroblock. If there are not any more macroblocks to be decoded in block 1052, then, in block 1055, the frame is sent to the buffer for reconstructed frames from both descriptions. In block 1057 a check is made to determine if there are any more frames to decode, and if there are not any more frames to decode in block 1057, then the decoder operation ends. If, in block 1057, it is determined that there are more frames to decode, then, in block 1060, the frame index, k, is incremented by one and the operation returns to block 1010 to continue decoding the bitstreams as described above.

FIG. 11 is a flow diagram representation of the operations performed by a decoder when the decoder is receiving only description one, in accordance with an embodiment of the present invention. In FIG. 11, in block 1105 the frame index k is initialized to zero. Then, in block 1110, the decoder receives a single bitstream from channel one and in block 1115 the bitstream is decoded to the macroblock level for each frame in the video bitstream. In block 1120, the mode used for a decoded macroblock is determined. If, in block 1120, the mode of the macroblock is determined to be the I-mode, then, in block 1125, the macroblock is decoded to the block level. In block 1130, each block from the macroblock is decoded into two sets of 64 coefficients, and in block 1132 an estimate for the two sets of 64 coefficients for the description on channel two, which was not received, is produced for each block. In block 1135 an inverse four-variable pairing transform is applied to each of the two sets of 64 coefficients to produce the DCT coefficients for each block. In block 1140, an inverse 8×8 DCT is applied to the DCT coefficients for each block to produce four 8×8 blocks. Then, in block 1145, the four 8×8 blocks are assembled into a 16×16 macroblock.

If, in block 1120, the mode of the macroblock is determined to be the P-mode, then, in block 1165, up to two motion vectors are decoded and a prediction macroblock is formed from a reconstructed previous frame from description one. In block 1170 the prediction macroblock from block 1165 is decoded to the block level. Then, in block 1175, each block from the prediction macroblock is decoded into two sets of 64 coefficients, and in block 1177 an estimate for the two sets of 64 coefficients for the description on channel two, which was not received, is produced for each block. In block 1180 an inverse four-variable pairing transform is applied to each of the two sets of 64 coefficients to produce the DCT coefficients for each block. In block 1185, an inverse 8×8 DCT is applied to the DCT coefficients for each block to produce four 8×8 blocks. Then, in block 1190, the four 8×8 blocks are assembled into a 16×16 macroblock, and in block 1195 the macroblock from block 1190 is added to the prediction macroblock formed in block 1165.

Regardless of whether I-mode or P-mode decoding is used, after either block 1145 or block 1195, in block 1150 the macroblocks from block 1145 and block 1195 are assembled into a frame. In block 1152, a check is made to determine if there are any more macroblocks in the current video frame to be decoded. If there are more macroblocks to be decoded, then the decoder returns to block 1120 and continues with the next macroblock. If there are not any more macroblocks to be decoded in block 1152, then, in block 1155, the frame is sent to the buffer for reconstructed frames from description one. In block 1157 a check is made to determine if there are any more frames to decode, and if there are not any more frames to decode in block 1157, then the decoder operation ends. If, in block 1157, it is determined that there are more frames to decode, then, in block 1160, the frame index, k, is incremented by one and the operation returns to block 1110 to continue decoding the bitstream as described above.

While the decoder method of operations shown in FIG. 11, and described above, is directed to an embodiment in which the decoder is only receiving description one, the method is equally applicable when only description two is being received. The modifications that are required merely involve changing block 1110 to receive the bitstream from channel two; changing block 1165 to form the prediction macroblock from the reconstructed previous frame from description two; and changing blocks 1132 and 1177 to estimate the coefficients sent on channel one.

In the foregoing detailed description and figures, several embodiments of the present invention are specifically illustrated and described. Accordingly, it will be appreciated that modifications and variations of the present invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention.

1. An encoder comprising: a first module responsive to an applied video frame that divides the video frame into non-overlapping macroblocks; an intra-frame coding (I-mode) module that develops multiple descriptions of an applied macroblock; a predictive coding (P-mode) module that develops multiple predictive descriptions of said applied macroblock; a mode selection element that applies each of the macroblocks created by said first module to either said I-mode module or to said P-mode module; a buffer, connected to said mode selector element, for storing macroblocks that were reconstructed by decoding macroblocks of previously applied video frames, which had been encoded; and a rate control unit that is connected to said mode selection element, said I-mode module and said P-mode module, for influencing whether said mode selection element applies the macroblocks created by said first module to said I-mode module or to said P-mode module.

2. The encoder of claim 1 where said buffer stores reconstructed macroblocks from a plurality of previously applied frames.

3. The encoder of claim 1 where said P-mode module encodes error between the applied macroblock and a macroblock from a previous frame that is used to predict said applied macroblock (best matching macroblock).

4. The encoder of claim 3 where said P-mode module also encodes a motion vector that describes displacement between said applied macroblock and said best matching macroblock.

5. The encoder of claim 4 where both the encoded error and the encoded motion vector are coded into said multiple descriptions.

6. The encoder of claim 1 where said mode selection element applies a macroblock of said created macroblocks to said I-mode module periodically.

7. The encoder of claim 6 where said mode selection element applies a macroblock of said created macroblocks to said I-mode module every x macroblocks that are routed to said P-mode module, where x is a number between 10 and 15.

8. The encoder of claim 1 where said rate control unit influences choice by said mode selection element as to whether to apply a macroblock to said I-mode module or to said P-mode module.

9. The encoder of claim 1 further comprising a redundancy allocation module, responsive to said I-mode module and to said P-mode module, for affecting the applying performed of said mode selection element based on desired redundancy.

10. The encoder of claim 1 where said buffer stores macroblocks of video frames that were reconstructed by using a first of said multiple descriptions, ψ₁, macroblocks of said video frames that were reconstructed by using a second of said multiple descriptions, ψ₂, and macroblocks of said video frames that were reconstructed by using all of said multiple descriptions, ψ₀.

11. The encoder of claim 10 where said P-mode module comprises a motion estimation coding element that, based on said ψ₁, ψ₂, and ψ₀ macroblocks and on said applied macroblock, develops multiple description coded motion vectors for said applied macroblock; a first motion compensated predictor responsive to a first of said multiple description coded motion vectors and to said ψ₁ macroblocks for developing code P1; a second motion compensated predictor responsive to a second of said multiple description coded motion vectors and to said ψ₂ macroblocks for developing code P2; a third motion compensated predictor responsive to all of said multiple description coded motion vectors and to said ψ₁, ψ₂, and ψ₀ macroblocks for developing code P0; and a multiple description encoder responsive to said P1, P2 and P0 for developing multiple prediction error descriptions.